Chapter 25
IN THIS CHAPTER
Quickly estimating sample size for several basic statistical tests
Adjusting for different levels of power and α
Adjusting for unequal group sizes and for attrition during the study
Sample-size calculations (also called power calculations) tend to frighten researchers and send them running to the nearest statistician. But if you need a ballpark idea of how many participants a new research project requires, you can use these ten quick-and-dirty rules of thumb.
The first six sections tell you how many participants must provide complete, analyzable data for you to have an 80 percent chance of getting a p value less than 0.05 when a true difference equal to your effect size really exists. In other words, these rules assume 80 percent power at α = 0.05, because those values are widely used in biological research. The remaining four sections tell you how to modify your estimate for other power or α values, and how to adjust it for unequal group sizes and for dropouts from the study.
To compare the means of two groups, first calculate E, the effect size (the difference between means that you consider important) divided by the standard deviation (SD). You need $16/E^2$ participants in each group, or $32/E^2$ participants altogether.
For example, say you’re comparing two hypertension drugs, Drug A and Drug B, on lowering systolic blood pressure (SBP). You might set the effect size at 10 mmHg, and you know from prior studies that the SD of the SBP change is 20 mmHg. Then $E = 10/20$, or 0.5, and you need $16/0.5^2$, or 64 participants in each group (128 total).
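If you'd rather let a computer do the arithmetic, here is a minimal Python sketch of the two-group calculation. It assumes the rule of thumb has the form 16/E² per group (which reproduces the 64-per-group figure in the example); the function name is made up for illustration, and results are rounded to the nearest whole participant because these are ballpark figures.

```python
def n_per_group_two_means(difference, sd):
    """Ballpark participants per group for comparing two means.

    Assumes the rule of thumb n = 16 / E**2, where E = difference / SD,
    for 80 percent power at alpha = 0.05.
    """
    e = difference / sd
    return round(16 / e ** 2)


# Hypertension example: 10 mmHg effect size, SD of 20 mmHg
per_group = n_per_group_two_means(10, 20)
print(per_group, "per group,", 2 * per_group, "total")
```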
To compare means among three groups, you need $20/E^2$ participants in each group.
Continuing the example from the preceding section, if you’re comparing three hypertension drugs (Drug A, Drug B, and Drug C) and if a mean difference of 10 mmHg in SBP between any pair of drug groups is important, then E is still $10/20$, or 0.5, but you now need $20/0.5^2$, or 80 participants in each group (240 total).
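To compare paired values, such as before and after measurements on the same participants, calculate E as the effect size divided by the SD of the changes. You need $8/E^2$ participants (pairs of values).
Imagine that you’re studying test scores in struggling students before and after tutoring. You determine that a six-point improvement in grade points is the effect size of importance, and the SD of the changes is ten points. Then $E = 6/10$, or 0.6, and you need $8/0.6^2$, or about 22 students, each of whom provides a before score and an after score.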
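The paired calculation can be sketched the same way. This snippet assumes the rule has the form 8/E² (which matches the "about 22 students" figure in the example); the function name is hypothetical, and the result is rounded to the nearest whole participant.

```python
def n_pairs(difference, sd_of_changes):
    """Ballpark number of participants (pairs of values) for a paired comparison.

    Assumes the rule of thumb n = 8 / E**2, where E = difference / SD of
    the changes, for 80 percent power at alpha = 0.05.
    """
    e = difference / sd_of_changes
    return round(8 / e ** 2)


# Tutoring example: six-point improvement, SD of changes is ten points
print(n_pairs(6, 10))
```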
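To compare two proportions, you need to know the two proportions ($P_1$ and $P_2$) that you’re comparing, along with their difference, $D = P_1 - P_2$. You also have to calculate the average of the two proportions: $P = (P_1 + P_2)/2$. You need $16 \times P \times (1 - P)/D^2$ participants in each group.
For example, if a disease has a 60 percent mortality rate, but you think your drug can cut this rate in half to 30 percent, then $P = (0.60 + 0.30)/2$, or 0.45, and $D = 0.60 - 0.30$, or 0.3. You need $16 \times 0.45 \times (1 - 0.45)/0.3^2$, or 44 participants in each group (88 total).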
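Here is a small Python sketch of the two-proportion arithmetic. It assumes the rule has the form 16 × P × (1 − P)/D², with P the average of the two proportions and D their difference, which reproduces the 44-per-group figure in the example. The function name is invented for illustration.

```python
def n_per_group_two_proportions(p1, p2):
    """Ballpark participants per group for comparing two proportions.

    Assumes the rule of thumb n = 16 * P * (1 - P) / D**2, where P is the
    average of the two proportions and D is their difference.
    """
    p_avg = (p1 + p2) / 2
    d = p1 - p2
    return round(16 * p_avg * (1 - p_avg) / d ** 2)


# Mortality example: 60 percent versus 30 percent
per_group = n_per_group_two_proportions(0.60, 0.30)
print(per_group, "per group,", 2 * per_group, "total")
```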
To test for a significant correlation, you need $8/r^2$ participants (pairs of values), where r is the true correlation coefficient you want to be able to detect.
Imagine that you’re studying the association between weight and blood pressure, and you want the correlation test to come out statistically significant if these two variables have a true correlation coefficient of at least 0.2. Then you need to study $8/0.2^2$, or 200 participants.
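The correlation rule is easy to tabulate for a few values of r. This sketch assumes the rule has the form 8/r² (which gives the 200 participants in the example); the function name is hypothetical.

```python
def n_for_correlation(r):
    """Ballpark participants needed to detect a true correlation of r.

    Assumes the rule of thumb n = 8 / r**2 for 80 percent power
    at alpha = 0.05.
    """
    return round(8 / r ** 2)


# Weaker correlations need far more participants
for r in (0.2, 0.3, 0.5):
    print(r, n_for_correlation(r))
```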
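For a survival comparison, the total number of events you need is $32/(\ln \mathrm{HR})^2$, where HR is the hazard ratio you consider important and ln is the natural logarithm. Here’s how the formula works out for several values of HR greater than 1: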
| Hazard Ratio | Total Number of Events |
|---|---|
| 1.1 | 3,523 |
| 1.2 | 963 |
| 1.3 | 465 |
| 1.4 | 283 |
| 1.5 | 195 |
| 1.75 | 102 |
| 2.0 | 67 |
| 2.5 | 38 |
| 3.0 | 27 |
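The table values can be regenerated with a few lines of Python. This sketch assumes the formula is 32 divided by the square of the natural log of the hazard ratio, rounded to the nearest whole event, which matches every row of the table. The function name is made up for illustration.

```python
import math


def total_events(hazard_ratio):
    """Ballpark total number of events for a survival comparison.

    Assumes the rule of thumb events = 32 / (ln HR)**2, rounded to the
    nearest whole event.
    """
    return round(32 / math.log(hazard_ratio) ** 2)


for hr in (1.1, 1.2, 1.3, 1.4, 1.5, 1.75, 2.0, 2.5, 3.0):
    print(hr, total_events(hr))
```

Because the same function works for any HR greater than 1, you can use it for hazard ratios that fall between the rows of the table.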
You can take a sample-size estimate that provides 80 percent power from one of the preceding rules and scale it up or down to provide some other power by multiplying it by a factor that depends on the power you want. For 90 percent power, the factor is about 1.33.
For example, if you know from a prior sample-size calculation that a study with 70 participants provides 80 percent power to test its primary objective, then a study that has $70 \times 1.33$, or 93 participants, will have about 90 percent power to test the same objective. The usual reason to consider power levels other than 80 percent is a limited sample. If you know that 70 participants provide 80 percent power, but you will have access to only 40, you can estimate the maximum power you are able to achieve.
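One way to get scale-up factors for any power is the usual normal approximation, in which sample size is proportional to the square of the sum of two z values (one for α, one for power). That approximation is an assumption here, not necessarily how the rule of thumb was derived, and both function names are hypothetical. It gives a factor of roughly 1.34 for 90 percent power, close to the rounded 1.33 used in the example, and it can also be run backward to estimate the power you get from a smaller sample.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def power_multiplier(target_power, alpha=0.05, base_power=0.80):
    """Factor that rescales a sample size from base_power to target_power.

    Assumes a two-sided test and the normal approximation
    n proportional to (z_alpha + z_power)**2.
    """
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    return ((z_alpha + Z.inv_cdf(target_power)) /
            (z_alpha + Z.inv_cdf(base_power))) ** 2


def achievable_power(n_available, n_for_base, alpha=0.05, base_power=0.80):
    """Approximate power when you have n_available participants instead of
    the n_for_base participants that would give base_power."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    z_total = z_alpha + Z.inv_cdf(base_power)
    return Z.cdf(z_total * (n_available / n_for_base) ** 0.5 - z_alpha)


print(round(power_multiplier(0.90), 2))    # factor for 90 percent power
print(round(achievable_power(40, 70), 2))  # power with 40 instead of 70
```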
You can take a sample-size estimate that was based on testing at the α = 0.05 level and scale it up or down to correspond to testing at some other α level by multiplying it by a factor that depends on the new α. For α = 0.025, the factor is about 1.2.
For example, imagine that you’ve calculated you need a sample size of 100 participants using α = 0.05 as your criterion for significance. Then your boss says you have to apply a two-fold Bonferroni correction (see Chapter 11) and use α = 0.025 as your criterion instead. You need to increase your sample size to $100 \times 1.2$, or 120 participants, to have the same power at the new α level.
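The same normal approximation (an assumption, with a made-up function name) gives a scaling factor for any α. For a two-fold Bonferroni correction it comes out at roughly 1.21, in line with the rounded factor of 1.2 in the example.

```python
from statistics import NormalDist


def alpha_multiplier(new_alpha, base_alpha=0.05, power=0.80):
    """Factor that rescales a sample size from base_alpha to new_alpha.

    Assumes a two-sided test and the normal approximation
    n proportional to (z_alpha + z_power)**2, with power held fixed.
    """
    z = NormalDist().inv_cdf
    z_power = z(power)
    return ((z(1 - new_alpha / 2) + z_power) /
            (z(1 - base_alpha / 2) + z_power)) ** 2


# Two-fold Bonferroni correction: alpha drops from 0.05 to 0.025
factor = alpha_multiplier(0.05 / 2)
print(round(factor, 2))
```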
When comparing means or proportions between two groups, you usually get the best power for a given sample size (meaning the design is most efficient) if both groups are the same size. If you choose to have unbalanced groups, you will need more participants overall in order to preserve statistical power. To keep the same power with a 2:1 allocation, make the larger group 50 percent bigger and the smaller group 25 percent smaller than the equal-group size.
Suppose that you’re comparing two equal-sized groups, Drug A and Drug B. You’ve calculated that you need two groups of 32, for a total of 64 participants. Now, you decide to randomize group assignment using a 2:1 ratio for A:B. To keep the same power, you’ll need $32 \times 1.5$, or 48 for Drug A, an increase of 50 percent. For B, you’ll want $32 \times 0.75$, or 24, a decrease of 25 percent, for a new overall total of 72 participants in the study.
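For allocation ratios other than 2:1, one reasonable approach is to keep the quantity 1/n₁ + 1/n₂ the same as in the equal-group design, because that quantity drives the precision of a two-group comparison. That is an assumption layered on top of the text, and the function name is hypothetical, but it reproduces the 48 and 24 in the example.

```python
import math


def unequal_group_sizes(n_equal, ratio):
    """Group sizes for a ratio:1 allocation with the same power as two
    equal groups of n_equal.

    Assumes power is preserved by holding 1/n1 + 1/n2 equal to 2/n_equal.
    Sizes are rounded up to whole participants.
    """
    larger = n_equal * (1 + ratio) / 2
    smaller = n_equal * (1 + ratio) / (2 * ratio)
    return math.ceil(larger), math.ceil(smaller)


# Example from the text: two groups of 32, re-allocated 2:1
drug_a, drug_b = unequal_group_sizes(32, 2)
print(drug_a, drug_b, drug_a + drug_b)
```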
Sample-size estimates apply to the number of participants who give you complete, analyzable data. In reality, you have to increase this estimate to account for participants who drop out of the study or provide incomplete data for other reasons (called attrition). Here’s how to scale up your sample-size estimate to develop an enrollment target that compensates for attrition, keeping in mind that longer studies may have higher attrition:
Enrollment = Number Providing Complete Data × 100/(100 – %Attrition)
Here are the enrollment scale-ups for several attrition rates:
| Expected Attrition | Increase the Enrollment by |
|---|---|
| 10% | 11% |
| 20% | 25% |
| 25% | 33% |
| 33% | 50% |
| 50% | 100% |
If your sample-size estimate says you need a total of 60 participants with complete data, and you expect a 25 percent attrition rate, you need to enroll $60 \times 100/(100 - 25)$, or 80 participants. That way, you’ll have complete data on 60 participants after a quarter of the original 80 are removed from analysis.
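The enrollment formula translates directly into code. In this short sketch, the function name is invented for illustration, and the result is rounded up so that you never enroll fewer participants than the formula calls for.

```python
import math


def enrollment_target(n_complete, attrition_pct):
    """Enrollment needed so that n_complete participants remain after
    attrition_pct percent of enrollees are lost.

    Implements Enrollment = n_complete * 100 / (100 - %Attrition).
    """
    if not 0 <= attrition_pct < 100:
        raise ValueError("attrition_pct must be at least 0 and below 100")
    return math.ceil(n_complete * 100 / (100 - attrition_pct))


# Example from the text: 60 completers needed, 25 percent attrition expected
print(enrollment_target(60, 25))
```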